STIRLING’S FORMULA 
1. INTRODUCTION


Our goal is to prove the following asymptotic estimate for n!, called Stirling’s formula. 


Theorem 1.1. As $n \to \infty$, $n! \sim \dfrac{n^n}{e^n}\sqrt{2\pi n}$. That is, $\displaystyle\lim_{n\to\infty} \frac{n!}{(n^n/e^n)\sqrt{2\pi n}} = 1$.
Example 1.2. Set $n = 10$: $10! = 3628800$ and $(10^{10}/e^{10})\sqrt{2\pi(10)} = 3598695.61\ldots$. The
difference between these, around 30104, is rather large by itself but is less than 1% of the
value of 10!. That is, Stirling’s approximation for 10! is within 1% of the correct value.
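
The numbers in Example 1.2 can be reproduced with a few lines of Python. This is only an illustrative sketch: the function name `stirling` is a label chosen here, not something from the text, and floating-point arithmetic limits it to moderate $n$.

```python
import math

def stirling(n: int) -> float:
    """Stirling's approximation (n^n / e^n) * sqrt(2*pi*n) (hypothetical helper)."""
    return (n / math.e) ** n * math.sqrt(2 * math.pi * n)

for n in (1, 5, 10, 20, 50):
    exact = math.factorial(n)
    approx = stirling(n)
    # The ratio exact/approx should approach 1 as n grows.
    print(n, exact, approx, exact / approx)
```

The printed ratio stays above 1 and approaches 1 as $n$ grows, while the absolute difference keeps growing.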


Stirling’s formula can also be expressed as an estimate for $\log(n!)$:
$$\log(n!) = n\log n - n + \frac{1}{2}\log n + \frac{1}{2}\log(2\pi) + \varepsilon_n, \tag{1.1}$$
where $\varepsilon_n \to 0$ as $n \to \infty$.


Example 1.3. Taking $n = 10$, $\log(10!) \approx 15.104$ and the logarithm of Stirling’s approximation
to 10! is approximately 15.096, so $\log(10!)$ and its Stirling approximation differ by
roughly .008.


Before proving Stirling’s formula we will establish a weaker estimate for log(n!) than 
(1.1) that shows nlogn is the right order of magnitude for log(n!). After proving Stirling’s 
formula we will give some applications and then discuss a little bit of its history. Stirling’s 
contribution to Theorem 1.1 was recognizing the role of the constant $\sqrt{2\pi}$.


2. WEAKER VERSION 
Theorem 2.1. For all $n \ge 2$, $n\log n - n < \log(n!) < n\log n$, so $\log(n!) \sim n\log n$.


Proof. The inequality $\log(n!) < n\log n$ is a consequence of the trivial inequality $n! < n^n$.
Here are three methods of showing $n\log n - n < \log(n!)$.
Method 1: A Riemann sum approximation for $\int_1^n \log x\,dx$ using right endpoints is $\log 2 + \cdots + \log n = \log(n!)$, which overestimates, so $\log(n!) > \int_1^n \log x\,dx = n\log n - n + 1$.
Method 2: The power series expansion of $e^n$ is $\sum_{k \ge 0} n^k/k!$. Comparing $e^n$ to the $n$th
term in the series gives us $e^n > n^n/n!$, so $n! > n^n/e^n$. Therefore $\log(n!) > n\log n - n$.
Method 3: For all $k \ge 1$, $e > (1 + 1/k)^k$. Multiplying this over $k = 1, 2, \ldots, n-1$, we get
$e^{n-1} > n^{n-1}/(n-1)! = n^n/n!$, so $n! > e\,n^n/e^n$. Thus $\log(n!) > n\log n - n + 1$.
Dividing through the inequality $n\log n - n < \log(n!) < n\log n$ by $n\log n$, we obtain
$1 - 1/\log n < \log(n!)/(n\log n) < 1$, so $\log(n!) \sim n\log n$.
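
As a numerical sanity check on Theorem 2.1, the sketch below evaluates $\log(n!)$ with the standard-library function `math.lgamma` (using $\log(n!) = \log\Gamma(n+1)$) and compares it with the two bounds. The helper name `log_factorial` is an assumption made for this illustration.

```python
import math

def log_factorial(n: int) -> float:
    """log(n!) computed as lgamma(n + 1) (hypothetical helper)."""
    return math.lgamma(n + 1)

for n in (2, 10, 100, 10**4, 10**6):
    lower = n * math.log(n) - n
    upper = n * math.log(n)
    lf = log_factorial(n)
    assert lower < lf < upper
    # The ratio tends to 1, but only at the slow rate 1 - 1/log(n).
    print(n, lf / upper)
```

The ratio is still only about 0.93 at $n = 10^6$, which reflects how slowly $1 - 1/\log n$ tends to 1.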


We won’t use Theorem 2.1 in the proof of Theorem 1.1, but it’s worth proving Theorem
2.1 first since the approximations $\log(n!) \approx n\log n - n$ or $\log(n!) \approx n\log n$ are how Stirling’s
formula is most often used in science. Large factorials occur when counting arrangements
of gas particles or quantum particles in different macrostates, or files in data compression.
While the lower order terms $\frac{1}{2}\log n + \frac{1}{2}\log(2\pi)$ in (1.1) are irrelevant in most scientific
applications of (1.1), they are used in calculations in quantum field theory.
Remark 2.2. In the proof of Theorem 2.1, the first and third methods lead to upper bounds
on $\log(n!)$ that are sharper than $n\log n$. By the first method, using left endpoints implies
$\int_1^n \log x\,dx > \log((n-1)!) = \log(n!) - \log n$, which leads to $\log(n!) < n\log n - n + \log n + 1$.
(We can do slightly better with the trapezoid approximation, which is the average of the
left endpoint and right endpoint approximations. It tells us, since $\log x$ is concave down,
that $\int_1^n \log x\,dx > \frac{1}{2}(\log(n!) + \log(n!) - \log n) = \log(n!) - \frac{1}{2}\log n$, so $\log(n!) < n\log n - n + \frac{1}{2}\log n + 1$.)
By the third method, the upper bound $e < (1 + 1/k)^{k+1}$ multiplied over
$k = 1, 2, \ldots, n-1$ leads to $e^{n-1} < n^n/(n-1)! = n^{n+1}/n!$, so $\log(n!) < n\log n - n + \log n + 1$.
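
The bounds from Theorem 2.1 and Remark 2.2 can be compared side by side. The following is a minimal sketch; the function `bounds` and its return layout are choices made here for illustration, and `math.lgamma(n + 1)` stands in for $\log(n!)$.

```python
import math

def bounds(n: int):
    """Return (lower, log(n!), trapezoid upper bound, left-endpoint upper bound)."""
    base = n * math.log(n) - n
    lower = base + 1                           # Methods 1 and 3 of Theorem 2.1
    upper_trap = base + 0.5 * math.log(n) + 1  # trapezoid bound in Remark 2.2
    upper_left = base + math.log(n) + 1        # left-endpoint / third-method bound
    return lower, math.lgamma(n + 1), upper_trap, upper_left

for n in (2, 10, 100, 1000):
    lower, lf, upper_trap, upper_left = bounds(n)
    assert lower < lf < upper_trap < upper_left
    print(n, upper_trap - lf)
```

By (1.1), the printed gap `upper_trap - lf` equals $1 - \frac{1}{2}\log(2\pi) - \varepsilon_n$, so it settles near 0.081 as $n$ grows: the trapezoid bound already has the correct $\frac{1}{2}\log n$ term and is off only by a constant.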


3. PROOF OF STIRLING’S FORMULA 


Any proof of Stirling’s formula needs to bring in a formula that involves 7. One such 
formula, which Stirling knew, is the Wallis product 
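$$\frac{\pi}{2} = \frac{2}{1}\cdot\frac{2}{3}\cdot\frac{4}{3}\cdot\frac{4}{5}\cdot\frac{6}{5}\cdot\frac{6}{7}\cdots.$$
Another formula is the evaluation of the Gaussian integral from probability theory:

$$\int_{-\infty}^{\infty} e^{-x^2/2}\,dx = \sqrt{2\pi}. \tag{3.1}$$

This integral will be how $2\pi$ enters the proof of Stirling’s formula here, and another idea
from probability theory will also be used in the proof.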
To prove Stirling’s formula, we begin with Euler’s integral for n!. 


Theorem 3.1 (Euler). For $n \ge 0$,

$$n! = \int_0^\infty x^n e^{-x}\,dx.$$


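Proof. We will use induction and integration by parts. The case $n = 0$ is a direct calculation:
$\int_0^\infty e^{-x}\,dx = -e^{-x}\big|_0^\infty = 0 - (-1) = 1$. If $n! = \int_0^\infty x^n e^{-x}\,dx$ for some $n$, then

$$\int_0^\infty x^{n+1} e^{-x}\,dx = \int_0^\infty u\,dv$$

where $u = x^{n+1}$ and $dv = e^{-x}\,dx$. Then $du = (n+1)x^n\,dx$ and $v = -e^{-x}$, so

$$
\begin{aligned}
\int_0^\infty x^{n+1} e^{-x}\,dx &= uv\Big|_0^\infty - \int_0^\infty v\,du \\
&= -\frac{x^{n+1}}{e^x}\Big|_0^\infty + \int_0^\infty (n+1)e^{-x}x^n\,dx \\
&= \lim_{b\to\infty}\left(-\frac{b^{n+1}}{e^b}\right) + 0 + (n+1)\int_0^\infty x^n e^{-x}\,dx \\
&= (n+1)\int_0^\infty x^n e^{-x}\,dx,
\end{aligned}
$$

which by induction is $(n+1)\,n! = (n+1)!$.¹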
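
Euler’s integral can be checked numerically for small $n$. The sketch below uses composite Simpson’s rule on a truncated interval $[0, 80]$; the truncation point, the step count, and the name `euler_integral` are all assumptions chosen for this illustration, and the neglected tail is negligible only for small $n$.

```python
import math

def euler_integral(n: int, upper: float = 80.0, steps: int = 20000) -> float:
    """Composite Simpson's rule for the integral of x^n e^(-x) over [0, upper].

    `steps` must be even; the tail beyond `upper` is ignored.
    """
    h = upper / steps

    def f(x: float) -> float:
        return x ** n * math.exp(-x)

    total = f(0.0) + f(upper)
    for i in range(1, steps):
        # Simpson weights: 4 at odd nodes, 2 at even interior nodes.
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3

for n in range(0, 8):
    print(n, euler_integral(n), math.factorial(n))
```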


¹ Using Theorem 3.1, $n! = \int_0^\infty x^n e^{-x}\,dx > \int_n^\infty x^n e^{-x}\,dx > n^n\int_n^\infty e^{-x}\,dx = n^n/e^n$ for $n \ge 2$, which is
another proof of the lower bound $\log(n!) > n\log n - n$ in Theorem 2.1.

Let’s consider the graph of $y = x^n e^{-x}$ for $x \ge 0$. By calculus, the graph has a maximum
at $x = n$ and inflection points at $x = n + \sqrt{n}$ and $x = n - \sqrt{n}$. Figure 1 shows the graphs for $1 \le n \le 4$.

FIGURE 1. Plot of $y = x^n e^{-x}$ for $1 \le n \le 4$.

These graphs, for larger $n$, look somewhat like bell curves from probability theory. In
probability, the density function of a normal random variable $X$ with mean $\mu$ and standard
deviation $\sigma$ has its maximum at $\mu$ and inflection points at $\mu + \sigma$ and $\mu - \sigma$, and the random
variable $Z = (X - \mu)/\sigma$ is then normal with mean 0 and standard deviation 1. Considering
the analogy $\mu \leftrightarrow n$ and $\sigma \leftrightarrow \sqrt{n}$, make the change of variables $t = (x - n)/\sqrt{n}$ in Euler’s
integral for $n!$. This sends $x = n$ to $t = 0$ and $x = n \pm \sqrt{n}$ to $t = \pm 1$, so


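$$
\begin{aligned}
n! &= \int_0^\infty x^n e^{-x}\,dx \\
&= \int_{-\sqrt{n}}^\infty (n + \sqrt{n}\,t)^n e^{-(n + \sqrt{n}\,t)}\sqrt{n}\,dt \\
&= \frac{n^n\sqrt{n}}{e^n}\int_{-\sqrt{n}}^\infty \left(1 + \frac{t}{\sqrt{n}}\right)^n e^{-\sqrt{n}\,t}\,dt.
\end{aligned}
\tag{3.2}
$$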


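The terms extracted out of the integral in (3.2) are exactly what appears in Stirling’s formula
except for the factor $\sqrt{2\pi}$, so to prove Stirling’s formula we will show

$$\left(1 + \frac{t}{\sqrt{n}}\right)^n e^{-\sqrt{n}\,t} \to e^{-t^2/2} \tag{3.3}$$

as $n \to \infty$, for each $t$, and then

$$\int_{-\sqrt{n}}^\infty \left(1 + \frac{t}{\sqrt{n}}\right)^n e^{-\sqrt{n}\,t}\,dt \to \int_{-\infty}^\infty e^{-t^2/2}\,dt = \sqrt{2\pi} \tag{3.4}$$

as $n \to \infty$, where the last equality is (3.1).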


Remark 3.2. There are two proofs of Stirling’s formula in [3] using a sequence of random
variables $X_n$ with mean $n$ and standard deviation $\sqrt{n}$, and the change of variables $T_n =
(X_n - n)/\sqrt{n}$. This same change of variables, in the form $t = (x - n)/\sqrt{n}$, is used without
motivation in the non-probabilistic proofs of Stirling’s formula in [17] and [18].


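To prove (3.3) requires some care. If we handwave, as $n \to \infty$

$$\left(1 + \frac{t}{\sqrt{n}}\right)^n e^{-\sqrt{n}\,t} = \left(\left(1 + \frac{t}{\sqrt{n}}\right)^{\sqrt{n}}\right)^{\sqrt{n}} e^{-\sqrt{n}\,t} \approx (e^t)^{\sqrt{n}}\, e^{-\sqrt{n}\,t} = 1,$$

which is wrong. The mistake is that although $(1 + t/\sqrt{n})^{\sqrt{n}} \to e^t$ as $n \to \infty$, it is not true
after raising both sides to the $\sqrt{n}$ power that $(1 + t/\sqrt{n})^n$ behaves like $e^{\sqrt{n}\,t}$: approximately
equal numbers need not remain approximately equal when raised to a large power.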
To write the integral in (3.2) over the whole real line, set

$$
f_n(t) =
\begin{cases}
0, & \text{if } t < -\sqrt{n}, \\
\left(1 + \dfrac{t}{\sqrt{n}}\right)^n e^{-\sqrt{n}\,t}, & \text{if } t \ge -\sqrt{n},
\end{cases}
$$

so $n! = (n^n\sqrt{n}/e^n)\int_{-\infty}^\infty f_n(t)\,dt$. Figure 2 is a plot of $y = f_n(t)$ for $1 \le n \le 4$ (solid)
and $y = e^{-t^2/2}$ (dashed). The graphs suggest that as $n \to \infty$, $f_n(t) \to e^{-t^2/2}$.

FIGURE 2. Plot of $y = f_n(t)$ for $1 \le n \le 4$ and $y = e^{-t^2/2}$.


Theorem 3.3. For each $t \in \mathbf{R}$, $f_n(t) \to e^{-t^2/2}$ as $n \to \infty$.


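Proof. We will show $\log f_n(t) \to -t^2/2$ as $n \to \infty$. Since $t$ is fixed in the limit calculation,
we can focus on $n$ that is large relative to $|t|$. For $\sqrt{n} > |t|$, i.e., $n > t^2$, we have

$$f_n(t) = \frac{(1 + t/\sqrt{n})^n}{e^{\sqrt{n}\,t}},$$

so

$$\log(f_n(t)) = n\log\left(1 + \frac{t}{\sqrt{n}}\right) - \sqrt{n}\,t.$$

For $n \ge 4t^2$, $|t/\sqrt{n}| \le 1/2$. We have $\log(1 + x) = x - x^2/2 + O(|x|^3)$ for $|x| \le 1/2$,² so

$$\log(f_n(t)) = n\left(\frac{t}{\sqrt{n}} - \frac{t^2}{2n} + O\!\left(\frac{|t|^3}{n\sqrt{n}}\right)\right) - \sqrt{n}\,t = -\frac{t^2}{2} + O\!\left(\frac{|t|^3}{\sqrt{n}}\right).$$

As $n \to \infty$, the $O$-term tends to 0, so the limit is $-t^2/2$.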
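
The pointwise convergence in Theorem 3.3, and the failure of the handwaving argument that would give the limit 1, can be seen numerically. This is a sketch under the assumption that evaluating $f_n(t)$ through its logarithm (with `math.log1p`) is accurate enough for the values of $n$ shown; the function name `f` simply mirrors $f_n$.

```python
import math

def f(n: int, t: float) -> float:
    """f_n(t) = (1 + t/sqrt(n))^n * exp(-sqrt(n) t) for t > -sqrt(n), else 0."""
    root = math.sqrt(n)
    if t <= -root:
        return 0.0
    # Exponentiate the logarithm to avoid overflow for large n.
    return math.exp(n * math.log1p(t / root) - root * t)

t = 1.5
for n in (1, 10, 100, 10**4, 10**6):
    # Second column should approach the third, not the value 1.
    print(n, f(n, t), math.exp(-t * t / 2))
```

For fixed $t$ the error in $\log f_n(t)$ decays like $|t|^3/\sqrt{n}$, so the convergence is visible but slow.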


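To deduce from Theorem 3.3 that $\int_{-\infty}^\infty f_n(t)\,dt \to \int_{-\infty}^\infty e^{-t^2/2}\,dt$, which would finish the
proof of Stirling’s formula by (3.4), we use the dominated convergence theorem: what is a
positive integrable function on $\mathbf{R}$ dominating $|f_n| = f_n$ for all $n$? By Figure 2, one such
function should be

$$
g(t) =
\begin{cases}
e^{-t^2/2}, & \text{if } t < 0, \\
f_1(t), & \text{if } t \ge 0
\end{cases}
=
\begin{cases}
e^{-t^2/2}, & \text{if } t < 0, \\
(1 + t)e^{-t}, & \text{if } t \ge 0,
\end{cases}
$$

which is positive and integrable on $\mathbf{R}$. To prove $0 \le f_n(t) \le g(t)$ for all $n$ and $t$, it’s obvious
for $t \le -\sqrt{n}$ since $f_n(t) = 0$. To prove $f_n(t) \le g(t)$ if $t > -\sqrt{n}$, take logarithms: we want to show

$$
\log f_n(t) = n\log\left(1 + \frac{t}{\sqrt{n}}\right) - \sqrt{n}\,t \overset{?}{\le}
\begin{cases}
-\dfrac{t^2}{2}, & \text{if } -\sqrt{n} < t < 0, \\
\log(1 + t) - t, & \text{if } t \ge 0.
\end{cases}
$$

² In [18] it’s shown for $|x| \le 1/2$ that $\log(1 + x) = x - x^2/2 + O(|x|^3/3)$ where the $O$-constant can be 2.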
Case 1: $\log f_n(t) \le -t^2/2$ for $-\sqrt{n} < t < 0$.
We will show the difference

$$\log f_n(t) + \frac{t^2}{2} = n\log\left(1 + \frac{t}{\sqrt{n}}\right) - \sqrt{n}\,t + \frac{t^2}{2} \tag{3.5}$$

for $-\sqrt{n} < t \le 0$ is increasing, so the fact that it vanishes at $t = 0$ implies it is negative for
$-\sqrt{n} < t < 0$. The derivative of (3.5) is

$$\frac{n}{1 + t/\sqrt{n}}\cdot\frac{1}{\sqrt{n}} - \sqrt{n} + t = \frac{t^2}{t + \sqrt{n}},$$

which is positive for $-\sqrt{n} < t < 0$ since the numerator and denominator are both positive.


Case 2: $\log f_n(t) \le \log(1 + t) - t$ for $t \ge 0$. This is trivial for $n = 1$, since $\log f_1(t) =
\log(1 + t) - t$ for $t \ge 0$. Thus we can take $n \ge 2$. We will show

$$\log(1 + t) - t - \log f_n(t) = \log(1 + t) - t - n\log\left(1 + \frac{t}{\sqrt{n}}\right) + \sqrt{n}\,t \tag{3.6}$$

for $t \ge 0$ is increasing, so the fact that it vanishes at $t = 0$ implies it is positive for $t > 0$.
The derivative of (3.6) is

$$\frac{1}{1 + t} - 1 - \frac{n}{1 + t/\sqrt{n}}\cdot\frac{1}{\sqrt{n}} + \sqrt{n} = \frac{(\sqrt{n} - 1)\,t^2}{(t + 1)(t + \sqrt{n})},$$

which is positive since the numerator and denominator are positive when $t > 0$ and $n \ge 2$.
Our proof of Stirling’s formula is now complete.
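
The two ingredients of the dominated convergence argument, the bound $f_n \le g$ and the convergence of $\int f_n$ to $\sqrt{2\pi}$, can be spot-checked numerically. The sketch below is only an illustration: the grid, the truncation point `upper`, the midpoint rule, and the small floating-point tolerance are all assumptions made here, not part of the proof.

```python
import math

def f(n: int, t: float) -> float:
    """f_n(t) as defined in the text, evaluated through its logarithm."""
    root = math.sqrt(n)
    if t <= -root:
        return 0.0
    return math.exp(n * math.log1p(t / root) - root * t)

def g(t: float) -> float:
    """Dominating function: exp(-t^2/2) for t < 0 and (1 + t) exp(-t) for t >= 0."""
    return math.exp(-t * t / 2) if t < 0 else (1 + t) * math.exp(-t)

def integral_of_f(n: int, upper: float = 60.0, steps: int = 100_000) -> float:
    """Midpoint rule for the integral of f_n over [-sqrt(n), upper] (tail ignored)."""
    lower = -math.sqrt(n)
    h = (upper - lower) / steps
    return h * sum(f(n, lower + (i + 0.5) * h) for i in range(steps))

grid = [i / 10 for i in range(-100, 301)]
for n in (1, 2, 10, 100):
    # Tolerance absorbs rounding where f_n and g agree exactly (n = 1, t >= 0).
    assert all(f(n, t) <= g(t) + 1e-12 for t in grid)
    print(n, integral_of_f(n), math.sqrt(2 * math.pi))
```

By (3.2) the integral of $f_n$ equals $n!\,e^n/(n^n\sqrt{n})$ exactly, so for $n = 1$ the printed value should be close to $e$, and the values decrease toward $\sqrt{2\pi} \approx 2.5066$.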


Remark. Similar to the proof that $f_n(t) \le g(t)$, we can prove what is suggested by
Figure 2: $f_{n+1}(t) < f_n(t)$ for $t > 0$ and $f_{n+1}(t) > f_n(t)$ for $-\sqrt{n} < t < 0$, so $f_n(t) \to e^{-t^2/2}$
as $n \to \infty$ from below for $t < 0$ and from above for $t > 0$. (At $t = 0$ we have $f_n(0) = 1$ for
all $n$.) For $t > -\sqrt{n}$, both $f_n(t)$ and $f_{n+1}(t)$ are positive and

$$\frac{d}{dt}\left(\log f_{n+1}(t) - \log f_n(t)\right) = -\frac{t\sqrt{n+1}}{t + \sqrt{n+1}} + \frac{t\sqrt{n}}{t + \sqrt{n}} = \frac{t^2(\sqrt{n} - \sqrt{n+1})}{(t + \sqrt{n})(t + \sqrt{n+1})},$$

which is negative for $t > -\sqrt{n}$ except for being 0 at $t = 0$, so $\log(f_{n+1}(t)/f_n(t))$ is decreasing
for $t > -\sqrt{n}$. Since $f_{n+1}(0)/f_n(0) = 1$, we get $f_{n+1}(t)/f_n(t) > 1$ for $-\sqrt{n} < t < 0$ and
$f_{n+1}(t)/f_n(t) < 1$ for $t > 0$.


4. APPLICATIONS OF STIRLING’S FORMULA 


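Example 4.1. The probability that flipping a fair coin $2n$ times results in exactly $n$ heads
and $n$ tails is $\binom{2n}{n}\left(\frac{1}{2}\right)^{2n}$. What is a good estimate for the size of this number? Writing $\binom{2n}{n}$
as $\dfrac{(2n)!}{n!\,n!} = \dfrac{(2n)!}{n!^2}$, by Stirling’s formula

$$\binom{2n}{n} \sim \frac{((2n)^{2n}/e^{2n})\sqrt{2\pi(2n)}}{(n^n/e^n)^2(2\pi n)} = \frac{2^{2n}}{\sqrt{\pi n}},$$

so $\binom{2n}{n}\left(\frac{1}{2}\right)^{2n} \sim 1/\sqrt{\pi n}$. This probability decays to 0 like $1/\sqrt{n}$.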
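
The estimate in Example 4.1 is easy to test with exact integer arithmetic. A minimal sketch, assuming Python 3.8 or later for `math.comb`; the function name `prob_equal_heads_tails` is chosen here for illustration.

```python
import math

def prob_equal_heads_tails(n: int) -> float:
    """Exact probability of n heads in 2n fair coin flips: C(2n, n) / 4^n."""
    return math.comb(2 * n, n) / 4 ** n

for n in (1, 10, 100, 1000):
    exact = prob_equal_heads_tails(n)
    estimate = 1 / math.sqrt(math.pi * n)
    # The ratio exact/estimate should tend to 1.
    print(n, exact, estimate, exact / estimate)
```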

This same calculation occurs in the theory of random walks. Suppose a person moves
around the $d$-dimensional lattice $\mathbf{Z}^d$ by jumping from any point to one of its $2d$ neighboring
points (differing in one coordinate by $\pm 1$), with a move in each of the $2d$ directions being
equally likely. If such a random walk starts at the origin, will it return to the origin
infinitely often? Pólya proved that if $d = 1$ or 2 then the probability of returning to the
origin infinitely often is 1, while for $d \ge 3$ this probability is 0. In picturesque language, a
drunkard who stumbles away from a bar by walking along a road or in a street grid is almost
surely going to come back to the bar, but if he can fly ($d = 3$) then it’s no longer certain
that he’ll return. The key to this result on random walks is that $\sum_{n \ge 1} 1/n^{d/2}$ diverges for
$d = 1$ and 2, and converges for $d \ge 3$. To see the connection to the coin problem above,
consider a random walk on $\mathbf{Z}$ (that is, $d = 1$). If a person starting at 0 takes steps left
and right by 1 unit with probability 1/2 each, then the probability the person returns to 0
after $2n$ steps (it’s impossible to return to 0 after an odd number of steps) is the probability
of taking $n$ left steps and $n$ right steps, so the probability is $\binom{2n}{n}\left(\frac{1}{2}\right)^{2n}$. The asymptotic
estimate $1/\sqrt{\pi n}$ from Stirling’s formula tells us that the sum of these probabilities over all $n$
diverges because $\sum_{n \ge 1} 1/\sqrt{n}$ diverges, and this divergence leads to probability 1 that such
events (returning to 0) occur infinitely often. Stirling’s formula can also be used to analyze
random walks in $\mathbf{Z}^d$ when $d \ge 2$.


Example 4.2. Let’s determine the number of digits in 100!. The number of digits in a
positive integer $N$ is $\lfloor \log_{10}(N) \rfloor + 1$, so we want to compute $\lfloor \log_{10}(100!) \rfloor + 1$. From (1.1),

$$\log_{10}(100!) = \frac{\log(100!)}{\log(10)} \approx \frac{100.5\log(100) - 100 + \frac{1}{2}\log(2\pi)}{\log(10)} = 157.96\ldots.$$

To pin down the approximation well enough to be sure that $\lfloor \log_{10}(100!) \rfloor$ is 157 (and not
158), we use a sharper form of Stirling’s formula having upper and lower bounds:

$$1 < \frac{n!}{(n^n/e^n)\sqrt{2\pi n}} < e^{1/(12n)}.$$

Taking logarithms,

$$0 < \log(n!) - \left(\left(n + \frac{1}{2}\right)\log n - n + \frac{1}{2}\log(2\pi)\right) < \frac{1}{12n}.$$

Dividing by $\log 10$,

$$0 < \log_{10}(n!) - \frac{(n + 1/2)\log n - n + (1/2)\log(2\pi)}{\log 10} < \frac{1}{12n\log 10}.$$

Let $S_n$ be the term subtracted from $\log_{10}(n!)$ above. Since $1/(12n\log 10) \le 1/(12\log 10) \approx
.036 < 1$, $\lfloor \log_{10}(n!) \rfloor$ is either $\lfloor S_n \rfloor$ or $\lfloor S_n \rfloor + 1$.

Taking $n = 100$, the difference between $\log_{10}(100!)$ and $S_{100}$ is bounded above by
$\frac{1}{1200\log 10} \approx .00036$. Since $S_{100} \approx 157.96\ldots$ differs from its nearest integer by more than
.00036, $\lfloor \log_{10}(100!) \rfloor = 157$ and thus 100! has 158 digits. In a similar way, 1000! has 2568
digits and 10000! has 35660 digits.

Since $1/(12n\log 10) \to 0$, heuristically we expect $\lfloor \log_{10}(n!) \rfloor = \lfloor S_n \rfloor$ most of the time,
but in rare instances an integer falls between $S_n$ and $\log_{10}(n!)$. The first $n$ where this
happens, making $\lfloor \log_{10}(n!) \rfloor$ equal to $\lfloor S_n \rfloor + 1$ rather than $\lfloor S_n \rfloor$, is the 13-digit number
$n = 6{,}561{,}101{,}970{,}383$. See [8].
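
The digit counts in Example 4.2 can be reproduced as follows. This is a sketch: `digits_exact`, `S`, and `digits_stirling` are names chosen here, double-precision arithmetic is assumed to be accurate enough for the $n$ shown, and the exact count is only taken for $n \le 1000$ because recent Python versions limit integer-to-string conversion to 4300 digits by default.

```python
import math

def digits_exact(n: int) -> int:
    """Number of decimal digits of n!, from the exact integer."""
    return len(str(math.factorial(n)))

def S(n: int) -> float:
    """S_n = ((n + 1/2) log n - n + (1/2) log(2 pi)) / log 10."""
    return ((n + 0.5) * math.log(n) - n + 0.5 * math.log(2 * math.pi)) / math.log(10)

def digits_stirling(n: int) -> int:
    """Digit count predicted by floor(S_n) + 1 (valid unless an integer
    lies between S_n and log10(n!), which the text notes is rare)."""
    return math.floor(S(n)) + 1

for n in (10, 100, 1000):
    print(n, digits_exact(n), digits_stirling(n), S(n))
print(10000, digits_stirling(10000))
```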


Example 4.3. For each positive integer $n$, the volume $V_n$ of the unit ball in $\mathbf{R}^n$ is
$\pi^{n/2}/\Gamma(n/2 + 1)$, where $\Gamma(t) = \int_0^\infty x^{t-1}e^{-x}\,dx$ for $t > 0$, so $n! = \Gamma(n + 1)$. In one, two, and
three dimensions the volume $V_n$ is 2, $\pi$, and $(4/3)\pi$, which is increasing. But for large $n$,
$V_n$ actually tends to 0. (In fact $V_n$ is increasing for $1 \le n \le 5$ and decreasing for $n \ge 5$.)
For even $n$, $V_n = \pi^{n/2}/(n/2)!$ so by Stirling’s formula $V_n \sim \dfrac{1}{\sqrt{\pi n}}\left(\dfrac{2\pi e}{n}\right)^{n/2}$, which tends to 0
as $n \to \infty$. The same asymptotic estimate holds for odd $n$ using an extension of Stirling’s
formula to the $\Gamma$-function.
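
The behavior of $V_n$ described in Example 4.3 can be tabulated with `math.gamma`. In this sketch the asymptotic expression $(2\pi e/n)^{n/2}/\sqrt{\pi n}$ is what Stirling’s formula gives when applied to $(n/2)!$; the function names are chosen for illustration.

```python
import math

def unit_ball_volume(n: int) -> float:
    """V_n = pi^(n/2) / Gamma(n/2 + 1)."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

def stirling_estimate(n: int) -> float:
    """Asymptotic estimate (2*pi*e/n)^(n/2) / sqrt(pi*n) for V_n."""
    return (2 * math.pi * math.e / n) ** (n / 2) / math.sqrt(math.pi * n)

for n in (1, 2, 3, 5, 6, 10, 20, 50, 100):
    v = unit_ball_volume(n)
    print(n, v, stirling_estimate(n), v / stirling_estimate(n))
```

The volumes peak at $n = 5$ and then fall off faster than exponentially, while the ratio to the estimate approaches 1.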
Example 4.4. The Bernoulli numbers $B_n$ are defined by $x/(e^x - 1) = \sum_{n \ge 0}(B_n/n!)x^n$.
They begin as
$$B_0 = 1,\quad B_1 = -\frac{1}{2},\quad B_2 = \frac{1}{6},\quad B_3 = 0,\quad B_4 = -\frac{1}{30},\quad B_5 = 0,\quad B_6 = \frac{1}{42},\quad B_7 = 0,\ \ldots.$$
For odd $n > 1$, $B_n = 0$. This sequence is important in number theory (values of the Riemann
zeta-function and early work on Fermat’s last theorem depend on them), topology (counting
exotic spheres), and numerical analysis (the Euler-Maclaurin summation formula). Initial
data suggest $B_n$ is small for even $n$, but this is misleading: $|B_n| \to \infty$ along even $n$. For instance,
$|B_{20}| \approx 529$ and $|B_{100}|$ has 79 digits before the decimal point. Using Stirling’s formula
and the Riemann zeta-function, we will give an asymptotic estimate on how large the even-indexed
Bernoulli numbers are.
For $s > 1$, the Riemann zeta-function $\zeta(s) = \sum_{n \ge 1} 1/n^s$ converges and Euler gave a
formula for it at positive even integers: for each positive integer $k$,

$$\zeta(2k) = \frac{(2\pi)^{2k}|B_{2k}|}{2(2k)!}.$$

As $s \to \infty$ we have $\zeta(s) \to 1$, so as $k \to \infty$ Stirling’s formula tells us

$$|B_{2k}| = \frac{2(2k)!\,\zeta(2k)}{(2\pi)^{2k}} \sim \frac{2(2k)!}{(2\pi)^{2k}} \sim \frac{2(2k/e)^{2k}\sqrt{2\pi(2k)}}{(2\pi)^{2k}} = 4\sqrt{\pi k}\left(\frac{k}{\pi e}\right)^{2k}.$$

Thus $|B_{2k}|$ tends to $\infty$ very rapidly as $k \to \infty$.
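
The growth of the even-indexed Bernoulli numbers can be checked with exact rational arithmetic. The sketch below does not use the generating function directly; it relies on the standard recurrence $\sum_{j=0}^{n}\binom{n+1}{j}B_j = 0$ for $n \ge 1$, which is equivalent to the definition by $x/(e^x - 1)$ but is not stated in the text. The function names are assumptions for this illustration.

```python
import math
from fractions import Fraction

def bernoulli_numbers(m: int) -> list:
    """Return [B_0, ..., B_m] as exact fractions via the recurrence
    sum_{j=0}^{n} C(n+1, j) B_j = 0 for n >= 1 (so B_1 = -1/2)."""
    B = [Fraction(1)]
    for n in range(1, m + 1):
        acc = sum(math.comb(n + 1, j) * B[j] for j in range(n))
        B.append(-acc / (n + 1))
    return B

def bernoulli_estimate(k: int) -> float:
    """Asymptotic size 4*sqrt(pi*k)*(k/(pi*e))^(2k) of |B_{2k}|."""
    return 4 * math.sqrt(math.pi * k) * (k / (math.pi * math.e)) ** (2 * k)

B = bernoulli_numbers(100)
for k in (1, 5, 10, 25, 50):
    exact = abs(B[2 * k])
    # Ratio of the exact value to the estimate should tend to 1.
    print(2 * k, float(exact), bernoulli_estimate(k), float(exact) / bernoulli_estimate(k))
```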


5. HISTORY OF STIRLING’S FORMULA 


Stirling’s formula first arose from correspondence between Stirling and DeMoivre in the 
1720s about DeMoivre’s work on approximating a binomial distribution by a normal dis- 
tribution. DeMoivre had essentially discovered the Central Limit Theorem for the normal 
approximation to the binomial distribution. The Central Limit Theorem is nowadays proved 
without Stirling’s formula, and for special types of probability distributions it leads to proofs 
of Stirling’s formula [7], [15], [20]. 

What DeMoivre showed in his work on approximating a binomial distribution was that

$$\binom{2n}{n}\frac{1}{2^{2n}} \approx 2.168\,\frac{(1 - 1/(2n))^{2n}}{\sqrt{2n - 1}},$$

with 2.168 being an approximation to a constant that DeMoivre
could express only as an infinite series. For large $n$, $(1 - 1/(2n))^{2n} \approx 1/e$ and $\sqrt{2n - 1} \approx
\sqrt{2}\sqrt{n}$, so DeMoivre’s approximation is essentially $(2.168/(\sqrt{2}\,e))/\sqrt{n}$. Stirling found the
“true” value of 2.168 to be $e\sqrt{2/\pi}$, which turns DeMoivre’s approximation into $1/\sqrt{\pi n}$, as
we found in Example 4.1.

Stirling’s treatment of approximations to $\log(n!)$ appeared in his book Methodus Differentialis
[22], as Example 2 after Proposition 28. An English translation is in [23, pp. 149-151].
Stirling’s contribution to the asymptotic estimate for $\log(n!)$ that is named after him is in
the identification of the constant in the formula as $\frac{1}{2}\log(2\pi) = \log\sqrt{2\pi}$ (without proof).
How did he realize the constant involves $\pi$? According to [13, p. 481], Stirling tabulated
$\log(n!)$, interpolated the sequence to factorials of half-integers, and observed agreement of
$(-1/2)!$ with $\sqrt{\pi}$ to 10 decimal places. Tweddle [23, p. 271] suggests Stirling found the link
with $\pi$ through recognizing it in a numerical approximation or by a skillful use of the Wallis
product for $\pi$. In any case, DeMoivre proved $\binom{2n}{n}\frac{1}{2^{2n}} \sim 1/\sqrt{\pi n}$ using the Wallis product for
$\pi$ in his book Miscellanea Analytica [6]. A discussion of DeMoivre’s work on this problem
is in [13, Chap. 24].

Stirling’s formula in the form $\log(n!) = n\log n - n + \frac{1}{2}\log(n) + \log\sqrt{2\pi} + \varepsilon_n$, where $\varepsilon_n \to 0$
as $n \to \infty$, can be refined to an asymptotic expansion called Stirling’s series that replaces
$\varepsilon_n$ with a series in powers of $1/n$. This expansion is due to DeMoivre; a version using a
series in powers of $1/(n + 1/2)$ was found earlier by Stirling [12].


Here is a summary of different ways that proofs of Stirling’s formula bring in $\pi$.
(1) The Wallis product for $\pi/2$: [5], [10].
(2) The Gaussian integral: [3], [7], [11], [14], [15], [16], [19], [20], [21], [24].
(3) In [1, Sect. 2.5, Chap. 5], the proof uses the formula $\Gamma(z)\Gamma(1 - z) = \pi/\sin(\pi z)$.
(4) In [2, p. 24] the proof uses $\Gamma(1/2) = \sqrt{\pi}$.
(5) In [4], the proof uses $\sqrt{n}/2^{n-1} = \prod_{k=1}^{n-1}\sin(k\pi/2n)$.
(6) In [9], $\pi$ comes from the formula $\prod_{n \ge 1}(1 - x^2/n^2) = \sin(\pi x)/(\pi x)$.